
Add split-and-retry path to GpuProjectExec #14724

Open

thirtiseven wants to merge 5 commits into NVIDIA:main from thirtiseven:project_split_retry
Conversation

@thirtiseven (Collaborator)

Fixes #14191.

Description

GpuProjectExec currently retries OOM via withRetryNoSplit only — if a projection runs cuDF kernels with internal scratch allocations the pre-split estimator cannot see (regex, string-replace, etc.), an OOM during the projection cannot be recovered and fails the task.

This PR adds a split-and-retry path: for purely deterministic projections, GpuProjectExec now drives the projection through RmmRapidsRetryIterator.withRetry(splitSpillableInHalfByRows). On GPU OOM the input batch is halved by rows and the projection is re-run on each half; the resulting sub-batches are concatenated back via ConcatAndConsumeAll.buildNonEmptyBatchFromTypes to preserve the single-batch contract of projectAndCloseWithRetrySingleBatch.
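The control flow described above can be sketched independently of the plugin as a generic halve-on-OOM retry. This is an illustrative simulation only: the real code path goes through RmmRapidsRetryIterator with spillable batches, and every name here (SimOom, projectWithSplitRetry) is invented for the sketch.

```scala
// Simulated GPU OOM; the plugin uses GpuSplitAndRetryOOM from spark-rapids-jni.
final class SimOom extends RuntimeException("simulated GPU OOM")

object SplitRetrySketch {
  // Project `rows` with `f`. On a simulated OOM, split the input in half by
  // rows, recurse on each half, and concatenate the pieces back together —
  // mirroring withRetry(splitSpillableInHalfByRows) + buildNonEmptyBatchFromTypes.
  // A single-row batch that still OOMs cannot be split further, so the
  // exception propagates, as it would in the real path.
  def projectWithSplitRetry[A, B](rows: Vector[A])(f: Vector[A] => Vector[B]): Vector[B] =
    try f(rows)
    catch {
      case _: SimOom if rows.size > 1 =>
        val (left, right) = rows.splitAt(rows.size / 2)
        projectWithSplitRetry(left)(f) ++ projectWithSplitRetry(right)(f)
    }
}
```

With a projection that "OOMs" on batches larger than two rows, the sketch halves once and produces the same output as a single-batch run, matching the single-batch contract the PR preserves.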

Mixed deterministic + non-deterministic projections fall through to the existing withRetryNoSplit path: the non-deterministic side is computed once on the full input batch and stitched row-by-row to the deterministic side, and row-splitting either side would break that alignment. Both GpuProjectExec.projectWithRetrySingleBatch and GpuTieredProject.projectWithRetrySingleBatchInternal dispatch on the same condition (forall(_.deterministic) / areAllDeterministic).
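The dispatch condition both sites share reduces to a conjunction of the config gate and a determinism check over the bound expressions. A minimal sketch, with Expr and the field names as placeholders rather than the plugin's actual types:

```scala
// Placeholder expression type: real code checks Spark's Expression.deterministic.
final case class Expr(name: String, deterministic: Boolean)

// Split-retry is taken only when the gate is on AND every projection
// expression is deterministic; any non-deterministic expression (e.g. rand())
// routes the whole projection to the legacy withRetryNoSplit path.
def canSplitRetry(exprs: Seq[Expr], splitRetryEnabled: Boolean): Boolean =
  splitRetryEnabled && exprs.forall(_.deterministic)
```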

A new internal config spark.rapids.sql.projectExec.splitRetry.enabled (default true) gates the new path so it can be disabled to revert to the prior behavior if regressions surface.
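If the gate behaves like other spark.rapids.sql.* flags, it can presumably be flipped per session (a sketch, assuming an active SparkSession named `spark`; confirm the exact semantics in RapidsConf):

```scala
// Revert to the prior withRetryNoSplit-only behavior for this session.
spark.conf.set("spark.rapids.sql.projectExec.splitRetry.enabled", "false")
```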

Checklists

Documentation

  • Updated for new or modified user-facing features or behaviors
  • No user-facing change

The new config is internal() and the public behavior is unchanged in non-OOM cases.

Testing

  • Added or modified tests to cover new code paths
  • Covered by existing tests
  • Not required

tests/src/test/scala/com/nvidia/spark/rapids/ProjectSplitRetrySuite.scala adds five cases covering both dispatch sites:

  • split-retry on GpuProjectExec.projectAndCloseWithRetrySingleBatch with forceSplitAndRetryOOM produces output equal to a single-batch projection
  • with the conf disabled, the same injection propagates GpuSplitAndRetryOOM through the legacy path
  • the same coverage on GpuTieredProject.projectAndCloseWithRetrySingleBatch
  • a mixed deterministic + GpuRand projection routes through the legacy retry path (verified by comparing the rand column to a non-injected reference run)
  • a plain forceRetryOOM on the new path returns a single piece without splitting

Performance

  • Tests ran and results are added in the PR description
  • Issue filed with a link in the PR description
  • Not required

TBD — perf testing to follow.

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
@greptile-apps (Contributor, Bot) commented May 5, 2026

Greptile Summary

This PR adds a split-and-retry path to GpuProjectExec and GpuTieredProject for purely deterministic projections: on GPU OOM the input batch is halved by rows, each half is projected independently, and results are concatenated back via ConcatAndConsumeAll.buildNonEmptyBatchFromTypes. The sb.incRefCount() guard correctly compensates for the outer withResource(sb) close in both dispatch sites. One resource-lifecycle concern: in runWithSplitRetry, closeOnExcept(pieces) wraps the buildNonEmptyBatchFromTypes call, which already closes all input batches in its own finally block — so if Table.concatenate throws OOM, both paths fire close() on the same underlying GpuColumnVector objects.

Confidence Score: 4/5

Safe to merge after verifying the double-close of pieces batches in runWithSplitRetry when Table.concatenate throws; all other changes are logically correct and well-tested.

The incRefCount() guard at both call sites is correctly applied and the new split-retry logic is sound. The one issue is in runWithSplitRetry: closeOnExcept(pieces) still holds live references when buildNonEmptyBatchFromTypes is called; if the concat throws, that function's finally closes the batches and then closeOnExcept fires a second close on the same cuDF-native objects, driving ref-counts negative.

sql-plugin/src/main/scala/com/nvidia/spark/rapids/basicPhysicalOperators.scala — specifically the closeOnExcept(pieces) scope in runWithSplitRetry.

Important Files Changed

Filename Overview
sql-plugin/src/main/scala/com/nvidia/spark/rapids/basicPhysicalOperators.scala Adds split-and-retry path to GpuProjectExec and GpuTieredProject; ref-count handling for sb is correct via incRefCount(), but runWithSplitRetry has a double-close risk on the pieces ArrayBuffer when buildNonEmptyBatchFromTypes throws in the multi-batch concat path.
sql-plugin/src/main/scala/com/nvidia/spark/rapids/RapidsConf.scala Adds PROJECT_SPLIT_RETRY_ENABLED internal conf (default true) with clear doc string and lazy accessor; follows existing config conventions correctly.
tests/src/test/scala/com/nvidia/spark/rapids/ProjectSplitRetrySuite.scala Five focused unit tests covering split-retry, conf-disabled fallback, tiered projection, mixed deterministic+GpuRand routing, and plain GpuRetryOOM; afterEach resets retry counters properly.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[projectAndCloseWithRetrySingleBatch\nwithResource sb] --> B[projectWithRetrySingleBatch]
    B --> C{splitRetry.enabled\n&& all deterministic?}
    C -- yes --> D[sb.incRefCount\nrunWithSplitRetry]
    D --> E[withRetry sb\nsplitSpillableInHalfByRows]
    E --> F{OOM?}
    F -- split --> G[run projection\non each half]
    G --> H[collect pieces\nArrayBuffer]
    H --> I[buildNonEmptyBatchFromTypes\nconcat & close pieces]
    I --> J[return single ColumnarBatch]
    F -- no OOM --> K[run projection\non full batch]
    K --> J
    C -- no --> L[withRetryNoSplit path\nexisting logic]
    L --> J

Reviews (2): Last reviewed commit: "address coemments"

Comment thread on sql-plugin/src/main/scala/com/nvidia/spark/rapids/basicPhysicalOperators.scala (Outdated)
@thirtiseven thirtiseven self-assigned this May 5, 2026
Comment on lines +265 to +275

    val pieces = ArrayBuffer[ColumnarBatch]()
    closeOnExcept(pieces) { _ =>
      while (resultIter.hasNext) {
        pieces += resultIter.next()
      }
      val outputTypes = (0 until pieces.head.numCols()).map { i =>
        pieces.head.column(i).asInstanceOf[GpuColumnVector].dataType()
      }.toArray
      ConcatAndConsumeAll.buildNonEmptyBatchFromTypes(pieces.toArray, outputTypes)
    }
  }

P1 Double-close of pieces when buildNonEmptyBatchFromTypes throws in the multi-batch path. ConcatAndConsumeAll.buildNonEmptyBatchFromTypes closes all input batches in its finally block (arrayOfBatches.foreach(_.close())). If it throws (e.g., OOM during Table.concatenate), both that finally and the closeOnExcept(pieces) handler call close on the same GpuColumnVector objects, driving the cuDF native ref-count negative. Narrow the closeOnExcept scope to cover only the collection loop and let buildNonEmptyBatchFromTypes take exclusive ownership for the concat step.

Suggested change

Before:

    val pieces = ArrayBuffer[ColumnarBatch]()
    closeOnExcept(pieces) { _ =>
      while (resultIter.hasNext) {
        pieces += resultIter.next()
      }
      val outputTypes = (0 until pieces.head.numCols()).map { i =>
        pieces.head.column(i).asInstanceOf[GpuColumnVector].dataType()
      }.toArray
      ConcatAndConsumeAll.buildNonEmptyBatchFromTypes(pieces.toArray, outputTypes)
    }
  }

After:

    val pieces = ArrayBuffer[ColumnarBatch]()
    closeOnExcept(pieces) { _ =>
      while (resultIter.hasNext) {
        pieces += resultIter.next()
      }
    }
    val outputTypes = (0 until pieces.head.numCols()).map { i =>
      pieces.head.column(i).asInstanceOf[GpuColumnVector].dataType()
    }.toArray
    // Transfer ownership: buildNonEmptyBatchFromTypes closes the batches itself
    // (in its finally block). Do not wrap in closeOnExcept here to avoid a
    // double-close if Table.concatenate throws.
    ConcatAndConsumeAll.buildNonEmptyBatchFromTypes(pieces.toArray, outputTypes)
  }

Development

Successfully merging this pull request may close these issues:

[FEA] GpuProjectExec support for split-retry to handle OOMs.